BIOSYNTHETIC CONSTRUCTS OF TGF-B

Background of the Invention 

This invention relates to biosynthetic 
peptide constructs of transforming growth 
factor-beta, to synthetic genes encoding polypeptides
having transforming growth factor-beta-like 
biological activity, to methods of producing such 
synthetic genes using recombinant DNA technology, and 
to the use of such biosynthetic peptide constructs as 
regulators of cell proliferation and growth. 

Transforming growth factor-beta (TGF-B) is a 
multifunctional peptide regulator of activity 
involving cellular or tissue response to injury or 
stress. This factor has the ability to stimulate 
cell proliferation in cells of mesenchymal origin, 
while also being able to inhibit the growth of 
epithelial cells, embryonic fibroblasts, endothelial 
cells, and T and B lymphocytes. In addition, TGF-B 
has a number of other regulatory activities which
appear to be related uniquely to the specialized
function of a particular cell type. For example,
TGF-B stimulates the production of matrix components
while inhibiting the synthesis and secretion of
proteolytic enzymes which act on these components,
thereby regulating the synthesis and degradation of 
extracellular matrix (for a review, see, e.g., Sporn 
et al. (1987) J. Cell Biol. 105:1039-1045). 
TGF-B-type activities have been identified in many
normal fibronectin- and collagen-producing
fibroblasts, as well as tissues such as kidney
(Roberts et al. (1983) Biochem. 22:5692-5698),
placenta (Frolik et al. (1983) Proc. Natl. Acad. Sci.
(USA) 80:3676-3680), and platelets (Childs et al.
(1982) Proc. Natl. Acad. Sci. (USA) 79:5312-5316; and
Assoian et al. (1983) J. Biol. Chem. 258:7155-7160),
as well as in tumor cells (see, e.g., Roberts et al.
(1980) Proc. Natl. Acad. Sci. (USA) 77:3494-3498).

There are four known molecular
configurations of TGF-B, each having an apparent
molecular weight of about 25,000 daltons. Three of
these species result from homodimeric (TGF-B1,
TGF-B2) and heterodimeric (TGF-B1.2) combinations of
the monomeric subunits, B1 and B2. The fourth
species is a homodimer of a B3 subunit. Each subunit
is processed from a precursor of about 390 amino
acids, and the mature subunit protein comprises
approximately 112 amino acids from the carboxy
terminus of the precursor. The B1 and B2 subunits have about 70%
amino acid sequence homology in their N-terminal
portions, and are highly conserved between species.
The deduced amino acid sequence of TGF-B3 shares
about 80% homology with types B1 and B2, with many of
the differences being conservative substitutions.
TGF-B1 was originally isolated from human platelets
and placenta (EP 0128849), and bovine kidney (Roberts
et al., ibid.). TGF-B2 was originally identified as
a cartilage-inducing factor (CIF) isolated from bovine
bone (US 4,774,228). TGF-B1.2 has been found in
porcine platelets and other cells which coexpress the
B1 and B2 chains (Cheifetz et al. (1988) J. Biol.
Chem. 263:10783-10789). TGF-B3 has been identified
in both human and chicken (ten Dijke et al. (1988) Proc.
Natl. Acad. Sci. (USA) 85:4715-4719).

The TGF-Bs belong to a larger gene family,
the members of which encode structurally similar
proteins that have similar regulatory activities
(reviewed in Massague (1987) Cell 49:437-438).
Included in this family are: (1) Vgl, a protein
involved in mesoderm formation during Xenopus
development; (2) decapentaplegic complex (DPP), a
polypeptide encoded by a Drosophila gene responsible
for development of the dorsoventral pattern in the
embryo; (3) OP1, a region of a native osteogenic
protein sequence encoded by exons of a genomic DNA
sequence retrieved by applicants; (4) cartilage
inducing factors (CIFs) isolated from bovine bone (US
4,774,228); (5) mammalian osteogenic bone matrix
proteins CBMP-2a, CBMP-2b, and CBMP-3, discovered by
applicants (see WO89/01453); and (6) B-inhibin-a and
b, gonadal proteins that suppress pituitary secretion
of follicle stimulating hormone. All of these
proteins are believed to dimerize during refolding,
and are inactive when reduced to the monomeric form.
In addition, many include portions of a common 
precursor peptide. 

Identification of the regulatory activities 
of the proteins in the TGF-B family, and the 
elucidation of their amino acid sequences, have 
resulted in research efforts directed to the 
production of these proteins by recombinant means. 
For example, EP 0200341 discloses nucleic acid 
sequences encoding native TGF-B and precursors 
thereof which can be expressed in a host eukaryotic
cell transformed therewith. EP 0150572 discloses the
manufacture of structural genes coding for TGF-B1 and
analogs thereof and the expression of these genes in
microorganisms. However, the design and expression
of consensus protein constructs having considerable
sequence homology with a number of the proteins in
the TGF-B family, and displaying TGF-B-like activity,
has heretofore not been contemplated. 

Accordingly, it is an object of this 
invention to provide novel analogs of TGF-B having 
TGF-B biological activities. Another object is to 
provide an efficient method of producing novel, 
active TGF-B analogs. Yet another object is to
provide genes encoding novel, non-native, TGF-B 
species and methods for their production using 
recombinant DNA techniques. Another object is to 
provide novel truncated forms of TGF-B and structural 
designs for proteins with TGF-B biological activity. 
A further object is to provide methods of regulating
cell proliferation using TGF-B analogs. 

These and other objects and features of the 
invention will be apparent from the description, 
drawings, and claims which follow. 



WO 91/05565 






PCT/US90/06006 



Summary of the Invention

It has been discovered that forms of native 
TGF-B which have been truncated at the N-terminus, 
and which have fewer than the native number of 
cysteine residues, demonstrate TGF-B-like biological
activity, including the ability to induce an
antiproliferative effect on mammalian epithelial
cells in vitro. It has also been unexpectedly
discovered that truncated analogs of other
structurally similar proteins in the TGF-B family
known to have unrelated biological activities also
possess this TGF-B-like activity. These discoveries
enabled the design and construction of DNAs encoding 
novel, non-native protein constructs which 
individually and combined are capable of inhibiting 
the proliferation of mammalian epithelial cells in 
culture. 

Thus, in one aspect, the invention comprises 
a truncated TGF-B analog produced by expression of 
recombinant DNA in a host cell and capable of 
inducing an antiproliferative effect in mammalian 
epithelial cells in vitro. This protein construct
includes two polypeptide chains, each including a 
biologically active domain, and each having fewer 
than 9, and preferably 6 or 8, cysteine (Cys) 
residues. It may further be characterized as being 
unglycosylated. 

In another aspect, the invention comprises a 
protein produced by expression of recombinant DNA in 
a prokaryotic host cell and including a pair of 
polypeptide chains of fewer than about 112 amino
acids each. The sequence of amino acids in each
chain is sufficiently duplicative of the sequence of
TGF-B such that the protein is capable of inducing an
anti-proliferative effect on mammalian epithelial
cells in vitro. Preferably, the polypeptide chain
contains fewer than 9, and more preferably 6 or 8, 
cysteine residues, and further, may be 
unglycosylated. 

The cysteine residues are involved in the
formation of intra- and inter-chain disulfide bonds
(folding), the correct formation of which results in
an active construct having TGF-B-like activity. In
eucaryotes, the synthesis and proper folding of the
protein can occur at least within those cells known
to express TGF-B; in prokaryotic cells, folding must
be performed in vitro, a difficult feat in that any
number of combinations of disulfide linkages exist
between two polypeptide chains, each having fewer than
9, and preferably 6 or 8, cysteine residues. An
important aspect of this invention is the discovery
that truncated constructs of the type described
herein may be post-translationally modified and
folded (by oxidation) in vitro to produce TGF-B-like
activity.

Several forms of TGF-B monomers are known in
nature: B1, B2, and B3. Investigation of the
properties and structure of these native forms
enabled the development of a rational design for
non-native (i.e., not known to be expressed in
nature), truncated protein constructs which also are
capable of differentially regulating cell
proliferation in various cell types. Further, upon
examination of a number of unrelated proteins with
some amino acid sequence homology, it was
unexpectedly discovered that they, too, possess
TGF-B-like activity.



Based on this knowledge, a series of 
consensus DNA sequences were designed with the goal 
of producing active TGF-B analogs. The sequences 
were based on partial amino acid sequence data 
obtained from native TGF-B species and from observed 
homologies with genes reported in the literature 
encoding proteins of the TGF-B family (including Vgl
and DPP), or on the amino acid sequences they encode,
having a presumed or demonstrated developmental
function. Several of the biosynthetic consensus
sequences have been expressed as fusion proteins in
prokaryotes, purified, cleaved, refolded, applied to
a mammalian in vitro assay system, and shown to have
TGF-B-like anti-proliferative activity.

In preferred aspects of the invention, the 
proteins encoded by these consensus sequences include 
the generic amino acid sequences: 



10 20 30 40 50

CXXXXLYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX

60 70 80 90 100

XXLXXXXXPXXXθXXXCCVXXXXXXXXXXXXXXXXXXXXXXθXXMXVXXCXCX

and

10 20 30 40 50

LYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX

60 70 80 90 100

XXLXXXXXPXXXθXXXCCVXXXXXXXXXXXXXXXXXXXXXXθXXMXVXXCXCX

wherein the letters indicate the amino acid residues
of standard single letter code, each "X"
independently represents one of the naturally
occurring amino acids or a derivative thereof, and
each "θ" independently represents an amino acid or a
peptide bond.
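
One way to check whether a candidate chain fits the generic sequences above is to translate them into a regular expression. The following is only an illustrative sketch under stated assumptions: "X" is taken to mean any single upper-case residue letter, the optional position (an amino acid or a peptide bond) is written θ, and the names `generic_to_regex`, `fits`, `GENERIC_FULL`, and `GENERIC_TRUNCATED` are hypothetical. The TGF-B1 analog used as input is the 8-Cys sequence listed elsewhere in this document.

```python
import re

# Generic sequences as read from the text. "θ" marks a position that is
# either one amino acid or nothing (a plain peptide bond); this notation
# is an assumption made for the sketch.
HEAD = "CXXXX"
BODY = (
    "LYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX"
    "XXLXXXXXPXXXθXXXCCV" + "X" * 22 + "θXXMXVXXCXCX"
)
GENERIC_FULL = HEAD + BODY   # longer generic sequence
GENERIC_TRUNCATED = BODY     # N-terminally truncated generic sequence


def generic_to_regex(generic):
    """Translate a generic sequence into a compiled regular expression."""
    parts = []
    for symbol in generic:
        if symbol == "X":
            parts.append("[A-Z]")    # any residue (letters not validated)
        elif symbol == "θ":
            parts.append("[A-Z]?")   # one residue or a bare peptide bond
        else:
            parts.append(symbol)     # conserved residue, matched literally
    return re.compile("".join(parts))


def fits(generic, chain):
    """Return True if the whole chain is an instance of the generic sequence."""
    return generic_to_regex(generic).fullmatch(chain) is not None


TGF_B1_ANALOG = (
    "CCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL"
    "YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS"
)
```

For example, `fits(GENERIC_FULL, TGF_B1_ANALOG)` tests the 8-Cys chain, and `fits(GENERIC_TRUNCATED, TGF_B1_ANALOG[5:])` tests the same chain with its first five residues removed.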

The currently preferred active peptide
constructs comprise amino acid sequences derived from
the three monomeric subunits of TGF-B (1, 2, and 3),
DPP, and OP1. The amino acid sequences of these
proteins are set forth in FIGURE 1 relative to the
sequence of TGF-B1. Preferred amino acid sequences
within the foregoing generic sequences are:

10 20 30 40 50 

CCVRQLYIDFRKDLGWK— WIHEPKGYHANFCLGPCPYIWS — L-DTQ — Y-SKV 
LP KR N A A S HR 

Q V Y S RAT T 

KKHVE - VQN IAQMYYE PLTEI NGSN AIL 
RRHS S DDVLDYHKF ADHF S V 



60 70 80 90 100 

LALYNQHNPGAS-AAPCCVPQALEPLPIVYYVGRKPKVEQL-SNMIVRSCKCS 
STIE S SD TLIKTI K 
G L V 
QT VHS E D-IPL TKMS ISM F DNNDNV LRHYE A DE G R 

NN K V KA Q DSVA LNDQST KN QE T VG 



and 



10 20 30 40 50 

LYIDFRKDLGWK— WIHEPKGYHANFCLGPCPYIWS L-DTQ Y-SKV 

PKR N A A S H R 

Q V YS RAT T 

VE - V QN IA Q M Y Y E PLTEI NGSN AIL 
S DDVLDYHKF ADHF S V 

60 70 80 90 100

IALYNQHNPGAS— AAPCCVPQALEPLPIVYYVGRKPKVEQL-SNMIVRSCKCS 
STIE S SD TLIKTI K 

G L v 

QT VHS E D-IPL TKMS ISM F DNNDNV LRHYE A DE G R 

NN K V KA Q DSVA LNDQST KN QE T VG 

wherein, at each position where more than one amino 
acid is shown vertically, any one of the amino acids 
shown may be used alternatively in various 
combinations, and "-" and "--" represent a peptide
bond. Note that the numbering of amino acids is
selected solely for purposes of facilitating 
comparisons among alternative sequences. These 
generic sequences have fewer than 9, and preferably 
6-8 cysteine residues where inter- and/or 
intramolecular disulfide bonds can form, and contain 
other critical amino acids which influence the 
tertiary structure of the proteins. Similar 
structural features are found in the above named 
known proteins of the TGF-B family whose amino acid 
sequences previously have been published. However, 
of these only the TGF-B species (1, 2, and 3) have 
been described as capable of inducing an 
anti-proliferative effect in mammalian epithelial 
cells in vitro.






Particularly useful sequences include analogs
having the following amino acid sequences: 



TGF-B1

10 20 30 40 50

CCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL

60 70 80 90 100

YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS;

TGF-B2

10 20 30 40 50

CCLRPLYIDFKRDLGWKWIHEPKGYNANFCAGACPYLWSSDTQHSRVLSLY

60 70 80 90 100

NTINPEASASPCCVSQDLEPLTILYYIGKTPKIEQLSNMIVKSCKCS;

TGF-B3

10 20 30 40 50

CCVRPLYIDFRQDLGWKWVHEPKGYYANFCSGPCPYLRSADTTHSTVLGL

60 70 80 90 100

YNTLNPEASASPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS;

Vgl

10 20 30 40 50

CKKRHLYVEFKDVGWQNWVIAPQGYMANYCYGECPYPLTEILNGSNHAIL

60 70 80 90 100

QTLVHSIEPEDIPLPCCVPTKMSPVAMLYLNDQSTVVLKNYQEMTVVGCGCR;

DPP

10 20 30 40 50

CRRHSLYVDFSDVGWDDWIVAPLGYDAYYCHGKCPFPLADHFNSTNHAVV

60 70 80 90 100

QTLVNNNNPGKVPKACCVPTQLDSISMLFYDNNDNVVLRHYENMAVDECGCR;

and

10 20 30 40 50

CCVRQLYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLA

60 70 80 90 100

LYNTANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS.



or more truncated analogs such as:

TGF-B1

10 20 30 40 50

LYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL

60 70 80 90 100

YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS;

TGF-B2

10 20 30 40 50

LYIDFKRDLGWKWIHEPKGYNANFCAGACPYLWSSDTQHSRVLSLY

60 70 80 90 100

NTINPEASASPCCVSQDLEPLTILYYIGKTPKIEQLSNMIVKSCKCS;

TGF-B3

10 20 30 40 50

LYIDFRQDLGWKWVHEPKGYYANFCSGPCPYLRSADTTHSTVLGL

60 70 80 90 100

YNTLNPEASASPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS;

Vgl

10 20 30 40 50

LYVEFKDVGWQNWVIAPQGYMANYCYGECPYPLTEILNGSNHAIL

60 70 80 90 100

QTLVHSIEPEDIPLPCCVPTKMSPVAMLYLNDQSTVVLKNYQEMTVVGCGCR;

DPP

10 20 30 40 50

LYVDFSDVGWDDWIVAPLGYDAYYCHGKCPFPLADHFNSTNHAVV

60 70 80 90 100

QTLVNNNNPGKVPKACCVPTQLDSISMLFYDNNDNVVLRHYENMAVDECGCR;

and

10 20 30 40 50

LYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLALYN

60 70 80 90 100

TANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS.



The name given to each of these sequences designates 
the natural source DNA sequence encoding the amino 
acid sequence which, as far as applicants are aware, 
exhibits the most homology with the recited TGF-B 
analog. 

The invention further includes DNA sequences
encoding these constructs and a prokaryotic host cell
engineered to express these DNA sequences. In a
preferred aspect of the invention, the prokaryotic
host cell (e.g., E. coli) is transfected with a vector
including the TGF-B-encoding DNA sequence. The
transformed cell is cultured to express the protein,
which is then purified and activated by oxidation in
vitro. The protein so treated has the ability to
induce an anti-proliferative effect on cultured
mammalian epithelial cells.

The biosynthetic constructs disclosed herein 
may be used to regulate cellular activities such as 
proliferation and growth. In this regard, these
constructs have wide potential clinical applications,
for example, by controlling the proliferation of
various tumor cell lines, or by enhancing the growth
rate of T and B lymphocytes in immunosuppressed (e.g.,
Acquired Immunodeficiency Syndrome (AIDS)) patients.
The constructs also may be used in cell cultures to
modulate growth of various types of eucaryotic cells.

Brief Description of the Drawings

The foregoing and other objects of this 
invention, the various features thereof, as well as 
the invention itself, may be more fully understood 
from the following description, when read together 
with the accompanying drawings, in which: 

FIGURE 1 is a comparison of the amino acid 
sequence of various proteins in the TGF-B family to 
TGF-B1; and 

FIGURES 2A and 2B are representations of a 
DNA sequence and corresponding amino acid sequence of 
a modified trp-LE leader sequence, two FB domains of
protein A, an Asp-Pro cleavage site, and (A) the 6
Cys TGF-B1 sequence, and (B) the 8 Cys TGF-B1
sequence.

Description

Nucleic acid sequences encoding truncated 
TGF-B analogs were designed based on sequence data 
reported in the literature, codons inferred from 
known amino acid sequences, and observations of 
partial homology with known genes of the TGF-B 
family. These sequences have been refined by 
comparison with the sequences present in certain 
regulatory genes from the TGF-B family. 

The naturally occurring proteins of the 
TGF-B family are made as precursors, with a large and 
poorly conserved N-terminal domain and a 
characteristic C-terminal domain in which a pattern 
of 7 cysteine (Cys) residues is highly conserved.
In addition to these Cys residues, certain other
amino acids are found in members of the family very
nearly in the same relative positions in sequence as
set forth below:

10 20 30 40 50

CXXXXLYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX

60 70 80 90 100

XXLXXXXXPXXXθXXXCCVXXXXXXXXXXXXXXXXXXXXXXθXXMXVXXCXCX

wherein each X independently represents a naturally
occurring amino acid, and θ or θθ represents an amino
acid or peptide bond.

The N-terminal sequence of mature TGF-B and 
other related proteins contains a variable number of 
Cys residues which appear to be crosslinked among 
each other or with a residue of another amino acid 
chain, but not to Cys residues in the C-terminal
domain. Maturation of the precursor to the mature
form of these proteins occurs by trypsin-like 
cleavages between the precursor and the mature 
protein, and possibly also within the precursor form 
as other similar cleavage sites are present therein. 

All members of the Vgl-related subgroup of 
the TGF-B family (including Vgl, DPP, OP1, CBMP-2a,
CBMP-2b, and CBMP-3) share the feature of two basic 
residues (i.e., Lys-Lys, Arg-Arg, Arg-Lys) following 
the first Cys in the conserved C-terminal domain. 
The conserved double basic residues may represent 
another secondary maturation site. Cleavage by 
trypsin or related protease releases a C-terminal 
domain containing only 6 Cys residues. Since the 
precursor region of TGF-B contains up to 5 Cys 
residues which are not crosslinked to the C-terminal 
domain, the first of the 7 Cys residues may not be 
crosslinked to the C-terminal domain either. 
Therefore 6 Cys residues appear to be sufficient for 
a properly folded C-terminal domain. 
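
The cysteine bookkeeping described above can be checked directly on a sequence. The sketch below is illustrative: it uses the 8-Cys TGF-B1 analog sequence listed in the Summary, drops its first five residues, and counts cysteines before and after. The function names are hypothetical and are not part of the disclosure.

```python
# 8-Cys TGF-B1 analog chain as listed in this document (98 residues).
TGF_B1_8CYS = (
    "CCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL"
    "YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS"
)


def count_cys(chain):
    """Number of cysteine (C) residues in a one-letter-code chain."""
    return chain.count("C")


def truncate_n_terminus(chain, n_residues):
    """Remove n_residues from the N-terminal end of the chain."""
    return chain[n_residues:]


TGF_B1_6CYS = truncate_n_terminus(TGF_B1_8CYS, 5)

print(count_cys(TGF_B1_8CYS), count_cys(TGF_B1_6CYS))
```

Removing the five N-terminal residues (CCVRQ) removes the first two cysteines, which is how the 8-Cys and 6-Cys forms of the same chain relate to one another.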

In view of this disclosure, skilled genetic 
engineers can design and synthesize genes which 
encode a number of appropriate amino acid sequences. 
These genes can be expressed in various types of 
eucaryotic cells but, for reasons of efficiency, are
preferably produced in prokaryotic host cells,
thereby providing large quantities of active
synthetic proteins such as truncated analogs,
muteins, fusion proteins, and other constructs, all
mimicking the biological activity of native TGF-B,
including the ability to induce an anti-proliferative
effect on cultured mammalian epithelial cells.

More specifically, the DNA sequences
designed according to the above criteria and logic
were constructed using known techniques involving
assembly of oligonucleotides manufactured in a DNA
synthesizer. The sequences may be expressed using
well established recombinant DNA technologies in
various prokaryotic host cells, and the expressed 
proteins may be cleaved from precursors, oxidized, 
and refolded in vitro for biological activity. This 
approach has been successful in producing a number of 
novel protein constructs not found in nature (as far 
as applicants are aware) which have TGF-B-like 
activity, i.e., the ability to induce an 
anti-proliferative effect on cultured mammalian 
epidermal cells. 

The design and production of such 
biosynthetic proteins, and other material aspects 
concerning the nature, utility, how to make, and how 
to use the subject matter claimed herein will be 
further understood from the following non-limiting 
examples, which constitute the best method currently 
known for practicing the various aspects of the 
invention. 

EXAMPLES 

1. Consensus Sequence Design 



Published amino acid sequences for TGF-B2,
TGF-B3, Vgl, and DPP-C were used to determine which
amino acids showed strong homology with the TGF-B1
sequence. FIGURE 1 compares the amino acid sequences
of these proteins with the sequence of TGF-B1 (*
denotes a match), and TABLE 1 summarizes the extent of
homology.

TABLE 1

comparison        no. of matches    % homology

TGF-B2/TGF-B1     78/115            67.8
TGF-B3/TGF-B1     85/115            73.9
DPP/TGF-B1        34/115            29.6
Vgl/TGF-B1        35/115            29.6
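
The percentages in TABLE 1 are simple ratios of matched positions to aligned positions. The sketch below shows that arithmetic; it assumes two sequences already aligned to equal length with "-" as the gap character, and the function names are hypothetical. Note that 35/115 evaluates to 30.4 rather than the 29.6 tabulated for Vgl, so one of the two Vgl entries may contain a transcription error.

```python
def count_matches(reference, other):
    """Count aligned positions where both sequences carry the same residue."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(reference, other) if a == b and a != "-")


def percent_homology(matches, aligned_length):
    """Matches as a percentage of aligned positions, to one decimal place."""
    return round(100.0 * matches / aligned_length, 1)


# Match counts taken from TABLE 1 (115 aligned positions each).
for name, matches in [("TGF-B2", 78), ("TGF-B3", 85), ("DPP", 34), ("Vgl", 35)]:
    print(name, percent_homology(matches, 115))
```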



In determining an appropriate consensus 
amino acid sequence for TGF-B analogs, from which 
encoding nucleic acid sequences can be determined, 
the following points were considered: (1) the known
amino acid sequences of natural source TGF-B1, 2, and
3 are ranked highest; (2) where an amino acid in the
sequence matches for all three proteins, it is used
in the synthetic gene sequence; (3) matching amino
acids in DPP and Vgl are used; (4) if Vgl or DPP
diverge, but either one is matched by TGF-B1, 2, or
3, this matched amino acid is chosen; and (5) where
all sequences diverge, the amino acid residue alanine
is chosen, provided that the secondary structure is
maintained.

Using these criteria, the preferred sequence
is:

10 20 30 40 50

CCVRQLYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLA

60 70 80 90

LYNTANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS.

In addition, the first consensus sequence
was designed to preserve 8 of the disulfide
crosslinks and the apparent structural homology among
the related proteins, while the second, more highly
truncated consensus sequence was designed to preserve
6 disulfide bonds. That sequence is:

10 20 30 40 50

LYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLALYN

60 70 80 90 100

TANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS.
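
The selection criteria of Example 1 can be sketched as a per-position rule applied to aligned sequences. This is only one possible reading of points (2) through (5): criterion (3) is interpreted here as "use the residue when Vgl and DPP agree with each other", the secondary-structure proviso of criterion (5) is not modeled, and the function names and the four-column toy alignment are hypothetical. The sketch is not claimed to reproduce the consensus sequences listed above at every position.

```python
GAP = "-"


def consensus_residue(b1, b2, b3, vgl, dpp):
    """Pick one residue for an aligned column (assumed reading of the criteria)."""
    tgf_b = (b1, b2, b3)
    if b1 == b2 == b3:                    # (2) all three TGF-B sequences agree
        return b1
    if vgl == dpp and vgl != GAP:         # (3) Vgl and DPP agree (assumption)
        return vgl
    for residue in (vgl, dpp):            # (4) Vgl or DPP matched by a TGF-B
        if residue != GAP and residue in tgf_b:
            return residue
    return "A"                            # (5) all diverge: alanine


def consensus(b1, b2, b3, vgl, dpp):
    """Apply consensus_residue to every column of five aligned sequences."""
    return "".join(consensus_residue(*column) for column in zip(b1, b2, b3, vgl, dpp))


# Hypothetical four-column alignment, one column per rule.
print(consensus("CKHI", "CRNV", "CQYI", "CKML", "CSDL"))
```

In the toy alignment the first column is resolved by criterion (2), the second by (4), the third by (5), and the fourth by the assumed reading of (3).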

2. Gene Preparation and Expression 

The synthetic genes designed using the 
criteria set forth above are produced by assembly of 
chemically synthesized oligonucleotides. 15-100mer 
oligonucleotides are synthesized on a Biosearch DNA 
Model 8600 Synthesizer, and are purified by 
polyacrylamide gel electrophoresis (PAGE) in
Tris-Borate-EDTA buffer (TBE). The DNA is then
electroeluted from the gel. Overlapping oligomers
are phosphorylated by T4 polynucleotide kinase, and
then ligated into larger blocks which may also be
purified by PAGE. Alternatively, natural gene
sequences and cDNAs may be used for expression. The
two resulting genes are shown as the latter portion
of the fusion sequences in FIGURES 2A and 2B. The
sequence shown in 2A is a truncated form of 2B; five
amino acids at the N-terminus have been eliminated.

To enable the expression of the synthetic
gene shown in FIGURE 2A in an E. coli host, the gene
is modified by cassette mutagenesis. The N-terminus
is replaced up to the ClaI site with a hinge region
that provides for release of the TGF-B protein from
the leader, preceded by a BamHI site for attachment
to leader peptides. The modified gene is then
attached to an FB-dimer leader at the BamHI site.
The complete fusion gene is shown in FIGURE 2A.

The fusion gene is then inserted as an EcoRI 
to PstI fragment into an expression vector based on 
pBR322 and containing a synthetic tryptophan (trp)
promoter/operator and a modified trp-LE (MLE) leader
(which is similar to the one described by Huston et
al. in Proc. Natl. Acad. Sci. (USA) (1988)
85:5879-5883, but having only a single EcoRI site at
the hinge of the MLE leader). The vector is opened
at the EcoRI and PstI restriction sites, and an
FB-FB/TGF-B gene fragment is then inserted
therebetween, where FB is fragment B of
Staphylococcal Protein A. The resulting expression
vector includes the TGF-B gene fused to a fragment encoding
FB.

3. Production of Active Analogs

The protein constructs are expressed in E.
coli (e.g., host strain JM101) grown in minimal medium
(M9) after starvation for trp and induction by
indoleacrylic acid (IAA). The cells are lysed and the
inclusion bodies collected by differential
centrifugation. The fusion proteins are purified
from the inclusion bodies by urea or guanidine
solubilization. The FB sequence is then chemically
cleaved from the TGF-B protein construct at the hinge
region of the fusion protein. The hinge region has
the sequence Asp-Pro-Asn-Gly, which can be cleaved at
the Asp-Pro site with dilute acid, or at the Asn-Gly
site with hydroxylamine. The resulting cleavage
products are passed through a Sephacryl-200HR column 
which separates most of the uncleaved fusion products 
from the TGF-B analogs. 
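
The two chemical cleavage options at the Asp-Pro-Asn-Gly hinge can be illustrated on a one-letter-code string. The sketch below assumes cleavage occurs between the two residues of each site ("DP" for dilute acid, "NG" for hydroxylamine); the function name, the reagent keys, and the short toy fusion sequence are hypothetical and do not correspond to the actual leader or FIGURE 2 sequences.

```python
# Two-residue sites named in the text, in one-letter code.
CLEAVAGE_SITES = {
    "dilute_acid": "DP",      # Asp-Pro
    "hydroxylamine": "NG",    # Asn-Gly
}


def cleave(chain, reagent):
    """Split chain between the two residues of every occurrence of the site."""
    site = CLEAVAGE_SITES[reagent]
    fragments = []
    start = 0
    index = chain.find(site)
    while index != -1:
        fragments.append(chain[start:index + 1])  # keep first site residue on the left
        start = index + 1
        index = chain.find(site, start)
    fragments.append(chain[start:])
    return fragments


# Toy fusion: hypothetical leader "MKAA", hinge "DPNG", start of an analog "LYID".
print(cleave("MKAADPNGLYID", "dilute_acid"))
print(cleave("MKAADPNGLYID", "hydroxylamine"))
```

The same function can be run on an analog sequence by itself to see whether it contains an internal Asp-Pro or Asn-Gly pair, which may be worth checking before choosing a reagent.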

Protein refolding is performed under the 
conditions of 50 mM Tris-HCl, pH 8.0, 3 M guanidine 
hydrochloride (GuHCl), 10 mM dithiothreitol (DTT),
and 1-10 mM oxidized glutathione. 

4. TGF-B Activity Assay

This assay is based on the ability of TGF-B
to inhibit DNA synthesis in the mink lung epithelial
cell line, ATCC no. CCL 64. A confluent culture of
CCL-64 maintained in Eagle's minimum essential medium
(EMEM) supplemented with 10% fetal bovine serum
(FBS), 200 units/ml penicillin, and 200 µg/ml
streptomycin, is used to seed a 48-well cell culture
plate at a cell density of 200,000 cells per well.
When the culture becomes confluent, the medium is
replaced with 0.5 ml of EMEM containing 1% FBS and
penicillin/streptomycin. The culture is incubated
for 24 hours at 37°C. The TGF-B test samples in EMEM
containing 0.5% FBS are then added to the wells, and
the cells are then incubated for another 18 hours.
After incubation, 1.0 µCi of 3H-thymidine in 10 µl is
added to each well. The cells are incubated for 4
hours at 37°C. The medium is then removed and the
cells washed once with ice-cold phosphate-buffered
saline. The DNA is precipitated by adding 0.5 ml of
10% TCA to each well and incubating at room 
temperature for 15 min. The cells are then washed 
three times with ice-cold distilled water and lysed 
with 0.5 ml 0.4 M NaOH. The lysate from each well is 
then transferred to a scintillation vial and the 
radioactivity recorded using a scintillation counter 
(Smith-Kline Beckman) . 

Each test sample is assayed in triplicate. 
A TGF-B control is included in each assay. The 
inhibition activity of each sample is expressed as 
the 50% effective dose (ED50), which is defined as 
the amount of material in ng/ml required to induce 
50% reduction in maximal incorporation of 
3H-thymidine.
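
The ED50 defined above can be estimated from a dilution series. The sketch below is one simple way to do so and is not taken from the source: it averages the triplicate counts at each dose and interpolates linearly on a log10 dose axis to the point where incorporation falls to half of the maximal value. The function name, the choice of interpolation, and the example counts are all assumptions made for illustration.

```python
import math
from statistics import mean


def ed50(doses_ng_per_ml, triplicate_cpm, max_cpm):
    """Dose (ng/ml) giving a 50% reduction in maximal 3H-thymidine counts.

    doses_ng_per_ml: positive doses, one per test concentration.
    triplicate_cpm: one list of replicate counts per dose.
    max_cpm: maximal incorporation (assumed measured without TGF-B).
    Returns None if the counts never cross the 50% level.
    """
    target = 0.5 * max_cpm
    points = sorted((dose, mean(counts))
                    for dose, counts in zip(doses_ng_per_ml, triplicate_cpm))
    for (d0, c0), (d1, c1) in zip(points, points[1:]):
        if c0 >= target >= c1 and c0 != c1:
            fraction = (c0 - target) / (c0 - c1)
            log_dose = math.log10(d0) + fraction * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None


# Hypothetical counts for three doses, each assayed in triplicate.
example = ed50([0.1, 1.0, 10.0],
               [[9100, 9000, 8900], [5100, 5000, 4900], [1000, 1100, 900]],
               max_cpm=10000)
print(example)
```

A fitted dose-response curve would normally be preferred when enough dose points are available; the interpolation here is kept deliberately minimal.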

The invention may be embodied in other 
specific forms without departing from the spirit or 
essential characteristics thereof. The present 
embodiments are therefore to be considered in all 
respects as illustrative and not restrictive, the 
scope of the invention being indicated by the 
appended claims rather than by the foregoing
description, and all changes which come within the
meaning and range of equivalency of the claims are
therefore intended to be embraced therein.

What is claimed is:

1. A truncated transforming growth factor-beta
construct produced by expression of recombinant DNA
in a host cell, capable of inducing an
anti-proliferative effect in mammalian epithelial
cells in vitro, and comprising two polypeptide
chains, each of said chains comprising an active
domain having fewer than 9 cysteine residues.

2. The construct of claim 1 wherein each of 
said active domains comprises 8 cysteine residues. 

3. A protein produced by expression of
recombinant DNA in a prokaryotic host cell, said
protein comprising a pair of polypeptide chains, each
of which has fewer than about 112 amino acids in a
sequence sufficiently duplicative of the sequence of
transforming growth factor-beta such that said
protein is capable of inducing an anti-proliferative
effect on mammalian epithelial cells in vitro.

4. The protein of claim 3 wherein each of said 
polypeptide chains comprises less than 7 cysteine 
residues. 

5. The protein of claim 4 wherein each of said 
polypeptide chains comprises 6 cysteine residues. 

6. The protein of claim 1 or 3 further 
characterized by being unglycosylated.

7. The protein of claim 1 or 3 comprising the

amino acid sequence:

10 20 30 40 50

CXXXXLYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX

60 70 80 90 100

XXLXXXXXPXXXθXXXCCVXXXXXXXXXXXXXXXXXXXXXXθXXMXVXXCXCX

wherein each "X" independently represents one of the
naturally occurring amino acids or a derivative
thereof, and each "θ" independently represents an
amino acid or a peptide bond.

8. The protein of claim 1 or 3 comprising the

amino acid sequence:

10 20 30 40 50

LYXXFXXDXGWXθWXXXPXGYXAXXCXGXCPXXXXθθXθXXXθθXθXXX

60 70 80 90 100

XXLXXXXXPXXXθXXXCCVXXXXXXXXXXXXXXXXXXXXXXθXXMXVXXCXCX

wherein each "X" independently represents one of the
naturally occurring amino acids or a derivative
thereof, and each "θ" independently represents an
amino acid or a peptide bond.

9. The protein of claim 7 comprising the amino

acid sequences:

10 20 30 40 50

CCVRQLYIDFRKDLGWK-WIHEPKGYHANFCLGPCPYIWS--L-DTQ--Y-SKV
LP KR N A A S HR
Q V Y S RAT T
KKHVE - VQN IAQMYYE PLTEI NGSN AIL
RRHS S DDVLDYHKF ADHF S V

60 70 80 90 100

LALYNQHNPGAS-AAPCCVPQALEPLPIVYYVGRKPKVEQL-SNMIVRSCKCS
STIE S SD TLIKTI K
G L V
QT VHS E D-IPL TKMS ISM F DNNDNV LRHYE A DE G R
NN K V KA Q DSVA LNDQST KN QE T VG

wherein, at each position where more than one amino
acid is shown, any one of said amino acids shown may
be in that position, and "-" and "--" represent a
peptide bond.

10. The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL 

60 70 80 90 100 

YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS . 

11. The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CCLRPLYIDFKRDLGWKWIHEPKGYNANFCAGACPYLWSSDTQHSRVLSLY 

60 70 80 90 100 

NTINPEASASPCCVSQDLEPLTILYYIGKTPKIEQLSNMIVKSCKCS. 
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12. The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CCVRPLYIDFRQDLGWKWVHEPKGYYANFCSGPCPYLRSADTTHSTVLGL 

60 70 80 90 100 

YKTIjNPEASASPCCVPQDLEPIjTILYYVGRTPKVEQLSNMVVKSCKCS . 

13 • The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CKKRHLYVEFKDVGWQNWVIAPQGYMANYCYGECPYPLTEILNGSNHAILQ 

60 70 80 90 100 

TLVHS IEPEDIPLPCCVPTKMSPVAMLYLNDQ . 

14, The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CRRHSLYVDFSDVGWDDWI VAPLGYDAYYCHGKCPFPLADHFN STNHAWQ 

60 70 80 90 100 

TLVNNNNPGKVPKACCVPTQLDS I SMLFYDNNDNWLRHYENMAVDECGCR < 



15 * The protein of claim 9 comprising the amino 

acid sequence: 



10 20 30 40 50 

CCVRQLYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLALYN 

60 70 80 90 100 

TANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS. 

16. The protein of claim 8 comprising the amino 

acid sequences: 

10 20 30 40 50 

LYIDFRKDLGWK-WIHEPKGYHANFCLGPCPYIWS—L-DTQ--Y-SKV 
P KR N A A S H R 

TO U V Y S RAT T 

VE - V QN IA Q M Y Y E PLTEI NGSN AIL 
S DDVLDYHKF ADHF S V 

FF S 1 * S E SHI GT G LSF ST I 

T G S AY VP AS T 

E AR A NSYMNAT 

P F L 
Q D 

K A SE S SF Q MPKS KP 

60 70 80 90 100 

LALYNQHNPGAS - AAPCCVPQ ALEPLP I VYYVGRKPKVEQL - SNMI VRS CKC S 
STIE S SD TLIKTI K 
G L v 

QT VHS E D-IPL TKMS ISM F DNNDNV LRHYE A DE G R 

NN K V KA Q DSVA LNDQST KN QE T VG 

NHYRMRGHSPFANL S R M GQ I K DI E 

Q LN GTKVN I T F EY VP A 

FT ANAV S KR AH 

v - E E EK D 

S Y 

I RA GWPG E E V A 

wherein, at each position where more than one amino 
acid is shown, any one of said amino acids shown may 
be in that position, and "-" and " — " represent a 
peptide bond. 



17. The protein of claim 16 comprising the amino 

acid sequence: 



10 20 30 40 50 

LYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL 

60 70 80 90 100 

YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS. 

18. The protein of claim 16 comprising the amino 

acid sequence: 



10 20 30 40 50 

LYIDFKRDLGWKWIHEPKGYNANFCAGACPYLWSSDTQHSRVLSLY 

60 70 80 90 100 

NTINPEASASPCCVSQDLEPLTILYYIGKTPKIEQLSNMIVKSCKCS. 

19. The protein of claim 16 comprising the amino 

acid sequence: 



10 20 30 40 50 

LYIDFRQDLGWKWVHEPKGYYANFCSGPCPYLRSADTTHSTVLGL 

60 70 80 90 100 

YNTLNPEASASPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS. 

20. The protein of claim 16 comprising the amino 

acid sequence: 



10 20 30 40 50 

LYVEFKDVGWQNWVIAPQGYMANYCYGECPYPLTEILNGSNHAIL 

60 70 80 90 100 

QTLVHSIEPEDIPLPCCVPTKMSPVAMLYLNDQSTVVLKNYQEMTVVGCGCR. 

21. The protein of claim 16 comprising the amino 

acid sequence: 



10 20 30 40 50 

LYVDFSDVGWDDWIVAPLGYDAYYCHGKCPFPLADHFNSTNHAVVQ 

60 70 80 90 100 

TLVNNNNPGKVPKACCVPTQLDSISMLFYDNNDNVVLRHYENMAVDECGCR. 

22. The protein of claim 16 comprising the amino 
acid sequence: 

10 20 30 40 50 

LYIDFKRDLGWKWVHEPKGYAANFCAGACPYLWSADTQHSRVLALYN 

60 70 80 90 100 

TANPEASAAPCCVPQDLEPLTILYYVGRTPKVEQLSNMVVKSCKCS. 

23. A DNA sequence encoding the amino acid 
sequence of one of the chains of the construct of 
claim 1 or 3. 

24. A prokaryotic cell engineered to express the 
DNA sequence encoding the amino acid sequence of one 
of the chains of the construct of claim 1 or 3. 

25. The prokaryotic cell of claim 24 wherein 
said cell is E. coli. 

26. A method of producing a protein construct 
having the activity of transforming growth 
factor-beta, said method comprising the steps of: 

(a) transforming a prokaryotic host cell 
with a vector comprising the DNA sequence of 
claim 23; 

(b) culturing said transformed host cell to 
express said protein construct; 

(c) purifying said protein construct; and 

(d) activating said protein construct by 
oxidation in vitro to produce a two chain 
disulfide-linked transforming growth 
factor-beta analog, said activated protein 
construct having an anti-proliferative 
effect on mammalian epithelial cells in 
vitro. 

27. The method of claim 26 wherein said 
activated protein comprises a dimer. 

28. The method of claim 26 wherein said 
activated protein comprises fewer than 9 cysteine 
residues. 

29. The method of claim 28 wherein said 
activated protein comprises 6-8 cysteine residues. 



30. The method of claim 28 wherein said 
activated protein comprises 8 cysteine residues. 

31. The method of claim 28 wherein said 
activated protein comprises 6 cysteine residues. 

32. The method of claim 28 wherein said 
activated protein is unglycosylated. 
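
By way of illustration only, the cysteine limits recited in claims 28 to 31 can be checked against the sequences of claims 10 and 17 with a short script. The sketch below is in Python; the function names are hypothetical, and it assumes the count is taken per polypeptide chain rather than per dimer, which the claims do not state.

```python
# Sequences copied from claims 10 and 17, with the two printed lines joined.
CLAIM_10 = (
    "CCVRQLYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL"
    "YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS"
)
CLAIM_17 = (
    "LYIDFRKDLGWKWIHEPKGYHANFCLGPCPYIWSLDTQYSKVLAL"
    "YNQHNPGASAAPCCVPQALEPLPIVYYVGRKPKVEQLSNMIVRSCKCS"
)


def count_cysteines(chain: str) -> int:
    """Count cysteine residues (one-letter code C) in a single chain."""
    return chain.upper().count("C")


def within_claim_28(chain: str) -> bool:
    """Fewer than 9 cysteines (assumption: counted per chain)."""
    return count_cysteines(chain) < 9


def within_claim_29(chain: str) -> bool:
    """Between 6 and 8 cysteines inclusive (assumption: counted per chain)."""
    return 6 <= count_cysteines(chain) <= 8


if __name__ == "__main__":
    for name, chain in (("claim 10", CLAIM_10), ("claim 17", CLAIM_17)):
        print(name, count_cysteines(chain), within_claim_28(chain), within_claim_29(chain))
```

Under that per-chain assumption, the claim 10 chain contains 8 cysteines and the claim 17 chain contains 6, which correspond to the counts recited in claims 30 and 31 respectively.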

FIGURE 1 (cont'd) 

TGF-B1 APCCVPQALEPLPIVYYVGRKPKVEQL-SNMIVRSCKCS 

FIGURE 2A 



10 20 30 40 50 
ATGAAAGCAATTTTCGTACTGAAAGGTTCACTGGACAGAGATCTGGACTC 
MKAIFVLKGSLDRDLDS 
MLE leader > BglII 

60 70 80 90 100 

TCGTCTGGATCTGGACGTTCGTACCGACCACAAAGACCTGTCTGATCACC 
R LDLDVRTDHKDL S D H 



110 120 130 140 150 

TGGTTCTGGTCGACCTGGCTCGTAACGACCTGGCTCGTATCGTTACTCCCG 
LVLVDLARNDLARIVTP 

SalI SmaI 

160 170 180 190 200 

GGTCTCGTTACGTTGCGGATCTGGAATTCATGGCTGACAACAAATTCAA 
GSRYVADLE FMAD NKFN 

EcoRI Fb-Fb leader — 

210 220 230 240 250 

CAAGGAACAGCAGAACGCGTTCTACGAGATCTTGCACCTGCCGAACCTGA 

KEQQNAFYE ILHLPNL 
-> MluI BglII BspMI 

260 270 280 290 300 

ACGAAGAGCAGAAGGACGGCTTCATCCAGAGCTTGAAGGATGAGCCCTCT 
N E E QKDGF I QSLKDE P S 

310 320 330 340 350 

CAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGC 
QSANLL ADAKKLNDAQA 
NheI FspI 

360 370 380 390 400 

ACCGAAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGG 
PKS DQGQFMADNKFNK 

410 420 430 440 450 

AACAGCAGAACGCGTTCTACGAGATCTTGCACCTGCCGAACCTGAACGAA 
EQQNAFYE ILHLPNLNE 
MluI BglII BspMI+ 

FIGURE 2A (cont'd) 



[Sequence illegible in the source from position 460 onward. Annotations still legible: NheI, FspI, end of leader, hinge, MstII, BstEII, ApaI, ScaI, BstXI, NheI, MstII.] 

FIGURE 2B 

10 20 30 40 50 

ATGAAAGCAATTTTCGTACTGAAAGGTTCACTGGACAGAGATCTGGACTC 
MKAIFVLKGSLDRDLDS 
MLE leader > BglII 



60 70 80 90 100 

TCGTCTGGATCTGGACGTTCGTACCGACCACAAAGACCTGTCTGATCACC 
RLDLDVRTDHKDLSDH 



110 120 130 140 150 

TGGTTCTGGTCGACCTGGCTCGTAACGACCTGGCTCGTATCGTTACTCCCG 
LVLVDLARNDLARIVTP 
SalI SmaI 



160 170 180 190 200 

GGTCTCGTTACGTTGCGGATCTGGAATTCATGGCTGACAACAAATTCAA 
GSRYVADLEFMADNKFN 

EcoRI | -Fb-Fb leader — 



210 220 230 240 250 

CAAGGAACAGCAGAACGCGTTCTACGAGATCTTGCACCTGCCGAACCTGA 

KEQQNAFYEILHLPNL 
-> MluI BglII BspMI 



260 270 280 290 300 

ACGAAGAGCAGAAGGACGGCTTCATCCAGAGCTTGAAGGATGAGCCCTCT 
NEEQKDGF IQSLKDEPS 



310 320 330 340 350 

CAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGC 
QSANLLADAKKLNDAQA 
NheI FspI 



360 370 380 390 400 

ACCGAAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGG 
PKSDQGQFMADNKFNK 

FIGURE 2B (cont'd) 

410 420 430 440 450 

AACAGCAGAACGCGTTCTACGAGATCTTGCACCTGCCGAACCTGAACGAA 

EQQNAFYEILHLPNLNE 
MluI BglII BspMI+ 

460 470 480 490 500 

GAGCAGAAGGACGGCTTCATCCAGAGCTTGAAGGATGAGCCCTCTCAGTC 
EQKDGFI QSLKDEPSQS 

510 520 530 540 550 

TGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCGA 
A N L L ADAKKLNDA QAP 

NheI FspI <-leader 

560 570 580 590 600 

AGGATCCTAATGGGTGCTGCGTGCGTCAGCTGTACATCGATTTCCGTAAA 

D P N G C C V R Q L Y I D F R K 
|===hinge===| TGF-beta with 8 cys 
BamHI 



660 670 680 690 700 

GACCTGGGTTGGAAGTCCGTACATCTGGTCTCTGGATACCCAGTACTCCAAG 
DLGWK PYIW S L D T Q Y S K 

ScaI BstXI 

710 720 730 740 750 

GTGCTGGCTCTGTACAATCAGCATAACCCGGGGGCTAGCGCAGCTCCG 
VLA LYNQ H NPGASAA P 

SmaI NheI 

760 770 780 790 800 

TGCTGTGTTCCACAGGCCTTGGAACCGCTGCCGATCGTCTATTACGTCGGC 
CCVPQA LEPLPIVYYVG 
StuI PvuI EagI 

FIGURE 2B (cont'd) 



810 820 830 840 850 

CGTAAGCCTAAGGTTGAACAGCTGTCTAACGTGATTGTGCGCAGTTGCA 

RKPKVEQLSNVIVRSC 
MstII PvuII FspI 



860 

AGTGCTCTTAAGGATCC 
K C S * 

AflII BamHI 
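
The translation and restriction-site annotations of Figure 2B can be checked mechanically. The following Python sketch translates the first 100 nucleotides of the figure using the standard genetic code and locates recognition sites. The helper names are hypothetical, and the recognition sequences are the commonly published ones for these enzymes rather than values taken from this document.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 1-100 of Figure 2B (the two printed lines joined).
FIG_2B_1_100 = (
    "ATGAAAGCAATTTTCGTACTGAAAGGTTCACTGGACAGAGATCTGGACTC"
    "TCGTCTGGATCTGGACGTTCGTACCGACCACAAAGACCTGTCTGATCACC"
)

# Commonly published recognition sequences (assumption: not from the source).
SITES = {
    "BglII": "AGATCT",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}


def translate(dna: str) -> str:
    """Translate from the first base; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def find_sites(dna: str, site: str) -> list[int]:
    """Return 1-based start positions of every occurrence of `site`."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + 1)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(translate(FIG_2B_1_100))
    for enzyme, site in SITES.items():
        print(enzyme, find_sites(FIG_2B_1_100, site))
```

Applied to these 100 nucleotides, the translation agrees with the amino acid lines printed beneath the DNA in Figure 2B, and the single BglII site starts at position 39, consistent with the annotation under the first line.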